Harden commit trailer subprocess handling and align trailer I/O paths#2125
Harden commit trailer subprocess handling and align trailer I/O paths#2125
Conversation
Agent-Logs-Url: https://github.com/gitpython-developers/GitPython/sessions/1a855cb6-0111-4f52-b48d-46417aec5bde Co-authored-by: Byron <63622+Byron@users.noreply.github.com>
Agent-Logs-Url: https://github.com/gitpython-developers/GitPython/sessions/1a855cb6-0111-4f52-b48d-46417aec5bde Co-authored-by: Byron <63622+Byron@users.noreply.github.com>
Agent-Logs-Url: https://github.com/gitpython-developers/GitPython/sessions/3d3e7ffc-d3af-478e-9c6c-128731cdd102 Co-authored-by: Byron <63622+Byron@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR refactors how commit trailers are generated/parsed via git interpret-trailers, aiming to harden subprocess handling and to better align trailer processing with commit message encoding behavior.
Changes:
- Centralize
git interpret-trailersinvocation in a newCommit._interpret_trailers()helper and reuse it from both trailer parsing and trailer application. - Adjust trailer parsing (
trailers_list) and trailer application (create_from_tree) to use the shared helper. - Add a test that exercises trailer creation/parsing when
i18n.commitencodingis set to a non-default encoding (ISO-8859-1).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
git/objects/commit.py |
Introduces _interpret_trailers() helper and updates trailer parse/apply paths to use it. |
test/test_commit.py |
Adds coverage for trailer handling when i18n.commitencoding is configured to ISO-8859-1. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ) | ||
| trailer: str = proc.communicate(str(self.message).encode())[0].decode("utf8") | ||
| trailer = trailer.strip() | ||
| trailer = self._interpret_trailers(self.repo, self.message, ["--parse"]).strip() |
There was a problem hiding this comment.
trailers_list calls _interpret_trailers without passing/using the commit's actual encoding (self.encoding). Since commit messages can be configured via i18n.commitencoding, trailer parsing should encode/decode using that encoding to avoid mis-decoding or UnicodeDecodeError for non-UTF-8 commit messages. Consider passing self.encoding through to _interpret_trailers (or have the helper derive the encoding from the commit/repo config).
| trailer = self._interpret_trailers(self.repo, self.message, ["--parse"]).strip() | |
| proc = Popen( | |
| [Git.GIT_PYTHON_GIT_EXECUTABLE, "interpret-trailers", "--parse"], | |
| stdin=PIPE, | |
| stdout=PIPE, | |
| stderr=PIPE, | |
| cwd=self.repo.working_tree_dir or self.repo.git_dir, | |
| ) | |
| stdout, _stderr = proc.communicate(self.message.encode(self.encoding)) | |
| finalize_process(proc) | |
| trailer = stdout.decode(self.encoding).strip() |
| def _interpret_trailers(cls, repo: "Repo", message: Union[str, bytes], trailer_args: Sequence[str]) -> str: | ||
| cmd = [repo.git.GIT_PYTHON_GIT_EXECUTABLE, "interpret-trailers", *trailer_args] | ||
| proc: Git.AutoInterrupt = repo.git.execute( # type: ignore[call-overload] | ||
| cmd, | ||
| as_process=True, | ||
| istream=PIPE, | ||
| ) | ||
| message_bytes = message if isinstance(message, bytes) else message.encode(cls.default_encoding, errors="strict") | ||
| stdout_bytes, _ = proc.communicate(message_bytes) | ||
| finalize_process(proc) | ||
| return stdout_bytes.decode(cls.default_encoding, errors="strict") |
There was a problem hiding this comment.
_interpret_trailers always encodes/decodes using cls.default_encoding (UTF-8) even when the repo/commit uses a different i18n.commitencoding. This can break trailer handling for commits whose bytes are not valid UTF-8 (e.g., ISO-8859-1), especially if callers pass bytes. Recommend adding an explicit encoding parameter (defaulting to the repo’s configured commit encoding) and using it consistently for both .encode() and .decode().
| stdout_bytes, _ = proc.communicate(str(message).encode()) | ||
| finalize_process(proc) | ||
| message = stdout_bytes.decode("utf8") | ||
| message = cls._interpret_trailers(repo, str(message), trailer_args) |
There was a problem hiding this comment.
create_from_tree has already computed conf_encoding from i18n.commitencoding, but trailer application calls _interpret_trailers without using it. To keep trailer I/O aligned with how the commit will be serialized, pass conf_encoding (or have _interpret_trailers derive it from the repo) when applying trailers.
| message = cls._interpret_trailers(repo, str(message), trailer_args) | |
| message = cls._interpret_trailers( | |
| repo, | |
| str(message), | |
| trailer_args, | |
| conf_encoding=conf_encoding, | |
| ) |
Agent-Logs-Url: https://github.com/gitpython-developers/GitPython/sessions/519084d5-d5e2-4486-a9cc-5c258e596e13 Co-authored-by: Byron <63622+Byron@users.noreply.github.com>
Agent-Logs-Url: https://github.com/gitpython-developers/GitPython/sessions/519084d5-d5e2-4486-a9cc-5c258e596e13 Co-authored-by: Byron <63622+Byron@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| stdout_bytes, _ = proc.communicate(str(message).encode()) | ||
| finalize_process(proc) | ||
| message = stdout_bytes.decode("utf8") | ||
| message = cls._interpret_trailers(repo, str(message), trailer_args) |
There was a problem hiding this comment.
In create_from_tree(), trailers are applied via _interpret_trailers() without passing the repo-configured commit encoding (conf_encoding). This means interpret-trailers I/O is always UTF-8 even when the commit will be serialized using a different encoding, which is inconsistent with trailers_list() and can produce different on-disk bytes. Pass encoding=conf_encoding when calling _interpret_trailers here (and avoid double str() conversion if possible).
| message = cls._interpret_trailers(repo, str(message), trailer_args) | |
| if not isinstance(message, str): | |
| message = str(message) | |
| message = cls._interpret_trailers(repo, message, trailer_args, encoding=conf_encoding) |
| cmd = [repo.git.GIT_PYTHON_GIT_EXECUTABLE, "interpret-trailers", *trailer_args] | ||
| proc: Git.AutoInterrupt = repo.git.execute( # type: ignore[call-overload] | ||
| cmd, | ||
| as_process=True, | ||
| istream=PIPE, | ||
| ) | ||
| message_bytes = message if isinstance(message, bytes) else message.encode(encoding, errors="strict") | ||
| stdout_bytes, _ = proc.communicate(message_bytes) | ||
| finalize_process(proc) | ||
| return stdout_bytes.decode(encoding, errors="strict") |
There was a problem hiding this comment.
_interpret_trailers() starts the subprocess before encoding message to bytes. If message.encode(...) raises (e.g., UnicodeEncodeError under a legacy commitencoding), the git process is left running until GC/AutoInterrupt finalization. Compute message_bytes before launching the process, or wrap subprocess lifetime in try/finally to ensure it is waited/terminated on all failure paths.
| cmd = [repo.git.GIT_PYTHON_GIT_EXECUTABLE, "interpret-trailers", *trailer_args] | |
| proc: Git.AutoInterrupt = repo.git.execute( # type: ignore[call-overload] | |
| cmd, | |
| as_process=True, | |
| istream=PIPE, | |
| ) | |
| message_bytes = message if isinstance(message, bytes) else message.encode(encoding, errors="strict") | |
| stdout_bytes, _ = proc.communicate(message_bytes) | |
| finalize_process(proc) | |
| return stdout_bytes.decode(encoding, errors="strict") | |
| message_bytes = message if isinstance(message, bytes) else message.encode(encoding, errors="strict") | |
| cmd = [repo.git.GIT_PYTHON_GIT_EXECUTABLE, "interpret-trailers", *trailer_args] | |
| proc: Git.AutoInterrupt = repo.git.execute( # type: ignore[call-overload] | |
| cmd, | |
| as_process=True, | |
| istream=PIPE, | |
| ) | |
| try: | |
| stdout_bytes, _ = proc.communicate(message_bytes) | |
| return stdout_bytes.decode(encoding, errors="strict") | |
| finally: | |
| finalize_process(proc) |
Agent-Logs-Url: https://github.com/gitpython-developers/GitPython/sessions/3cc0bd6d-d54d-4299-9a18-1576c2a91c12 Co-authored-by: Byron <63622+Byron@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| try: | ||
| stdout_bytes, _ = proc.communicate(message_bytes) | ||
| return stdout_bytes.decode(encoding, errors="strict") | ||
| finally: | ||
| finalize_process(proc) |
There was a problem hiding this comment.
In _interpret_trailers, proc.communicate() captures stderr but it’s discarded, and then finalize_process(proc) may raise GitCommandError without including the subprocess stderr (pipes may already be drained/closed). Capture stderr_bytes from communicate() and pass it into finalize_process(proc, stderr=stderr_bytes) so failures surface with a useful error message.
| try: | |
| stdout_bytes, _ = proc.communicate(message_bytes) | |
| return stdout_bytes.decode(encoding, errors="strict") | |
| finally: | |
| finalize_process(proc) | |
| stderr_bytes = None | |
| try: | |
| stdout_bytes, stderr_bytes = proc.communicate(message_bytes) | |
| return stdout_bytes.decode(encoding, errors="strict") | |
| finally: | |
| finalize_process(proc, stderr=stderr_bytes) |
| stdout_bytes, _ = proc.communicate(str(message).encode()) | ||
| finalize_process(proc) | ||
| message = stdout_bytes.decode("utf8") | ||
| message = cls._interpret_trailers(repo, str(message), trailer_args) |
There was a problem hiding this comment.
create_from_tree reads i18n.commitencoding into conf_encoding, but the trailer application path calls _interpret_trailers(...) without passing that encoding (so it always encodes/decodes as UTF-8). To keep trailer I/O aligned with the configured commit encoding (and avoid byte-length differences affecting wrapping/formatting), pass encoding=conf_encoding when calling _interpret_trailers here.
| message = cls._interpret_trailers(repo, str(message), trailer_args) | |
| message = cls._interpret_trailers(repo, str(message), trailer_args, encoding=conf_encoding) |
This MR contains the following updates: | Package | Type | Update | Change | OpenSSF | |---|---|---|---|---| | [gitpython](https://github.com/gitpython-developers/GitPython) | dev | patch | `3.1.46` → `3.1.47` | [](https://securityscorecards.dev/viewer/?uri=github.com/gitpython-developers/GitPython) | --- ### Release Notes <details> <summary>gitpython-developers/GitPython (gitpython)</summary> ### [`v3.1.47`](https://github.com/gitpython-developers/GitPython/releases/tag/3.1.47): - with security fixes [Compare Source](gitpython-developers/GitPython@3.1.46...3.1.47) #### Advisories - <GHSA-rpm5-65cw-6hj4> - <GHSA-x2qx-6953-8485> #### What's Changed - Prepare next release by [@​Byron](https://github.com/Byron) in [#​2095](gitpython-developers/GitPython#2095) - Bump git/ext/gitdb from `335c0f6` to `4c63ee6` by [@​dependabot](https://github.com/dependabot)\[bot] in [#​2096](gitpython-developers/GitPython#2096) - DOC: README Add urls and updated a relative url by [@​Timour-Ilyas](https://github.com/Timour-Ilyas) in [#​2098](gitpython-developers/GitPython#2098) - Fix GitConfigParser ignoring multiple \[include] path entries by [@​daniel7an](https://github.com/daniel7an) in [#​2100](gitpython-developers/GitPython#2100) - Switch back from Alpine to Debian for WSL by [@​EliahKagan](https://github.com/EliahKagan) in [#​2108](gitpython-developers/GitPython#2108) - Bump git/ext/gitdb from `4c63ee6` to `5c1b303` by [@​dependabot](https://github.com/dependabot)\[bot] in [#​2106](gitpython-developers/GitPython#2106) - Run `gc.collect()` twice in `test_rename` on Python 3.12 by [@​EliahKagan](https://github.com/EliahKagan) in [#​2109](gitpython-developers/GitPython#2109) - fix: guard AutoInterrupt terminate during interpreter shutdown by [@​lweyrich1](https://github.com/lweyrich1) in [#​2105](gitpython-developers/GitPython#2105) - Improve CI infrastructure for pre-commit by [@​EliahKagan](https://github.com/EliahKagan) in [#​2110](gitpython-developers/GitPython#2110) - Bump the pre-commit group with 5 updates by [@​dependabot](https://github.com/dependabot)\[bot] in [#​2111](gitpython-developers/GitPython#2111) - Upgrade Sphinx for 3.14 support; drop doc build support on 3.8; test 3.14 by [@​EliahKagan](https://github.com/EliahKagan) in [#​2112](gitpython-developers/GitPython#2112) - Fix `Repo.active_branch` resolution for reftable-backed repositories by [@​Copilot](https://github.com/Copilot) in [#​2114](gitpython-developers/GitPython#2114) - docs: warn about GitDB performance with large commits by [@​mvanhorn](https://github.com/mvanhorn) in [#​2115](gitpython-developers/GitPython#2115) - cmd: fix kwarg formatting in docstring example by [@​UweSchwaeke](https://github.com/UweSchwaeke) in [#​2117](gitpython-developers/GitPython#2117) - Bump <https://github.com/astral-sh/ruff-pre-commit> from v0.15.5 to 0.15.8 in the pre-commit group by [@​dependabot](https://github.com/dependabot)\[bot] in [#​2122](gitpython-developers/GitPython#2122) - Add trailer support for commit creation by [@​Krishnachaitanyakc](https://github.com/Krishnachaitanyakc) in [#​2116](gitpython-developers/GitPython#2116) - Harden commit trailer subprocess handling and align trailer I/O paths by [@​Copilot](https://github.com/Copilot) in [#​2125](gitpython-developers/GitPython#2125) - git.cmd.Git.execute(..): fix `with_stdout=False` by [@​ngie-eign](https://github.com/ngie-eign) in [#​2126](gitpython-developers/GitPython#2126) - Make sure that multi-options are checked after splitting them with `shlex` by [@​Byron](https://github.com/Byron) in [#​2130](gitpython-developers/GitPython#2130) - Block unsafe underscored git kwargs / Fix for GHSA-rpm5-65cw-6hj4 by [@​WesR](https://github.com/WesR) in [#​2131](gitpython-developers/GitPython#2131) #### New Contributors - [@​Timour-Ilyas](https://github.com/Timour-Ilyas) made their first contribution in [#​2098](gitpython-developers/GitPython#2098) - [@​daniel7an](https://github.com/daniel7an) made their first contribution in [#​2100](gitpython-developers/GitPython#2100) - [@​lweyrich1](https://github.com/lweyrich1) made their first contribution in [#​2105](gitpython-developers/GitPython#2105) - [@​Copilot](https://github.com/Copilot) made their first contribution in [#​2114](gitpython-developers/GitPython#2114) - [@​mvanhorn](https://github.com/mvanhorn) made their first contribution in [#​2115](gitpython-developers/GitPython#2115) - [@​UweSchwaeke](https://github.com/UweSchwaeke) made their first contribution in [#​2117](gitpython-developers/GitPython#2117) - [@​Krishnachaitanyakc](https://github.com/Krishnachaitanyakc) made their first contribution in [#​2116](gitpython-developers/GitPython#2116) - [@​ngie-eign](https://github.com/ngie-eign) made their first contribution in [#​2126](gitpython-developers/GitPython#2126) - [@​WesR](https://github.com/WesR) made their first contribution in [#​2131](gitpython-developers/GitPython#2131) **Full Changelog**: <gitpython-developers/GitPython@3.1.46...3.1.47> </details> --- - [ ] <!-- rebase-check -->If you want to rebase/retry this MR, check this box --- This MR has been generated by [Mend Renovate](https://github.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0My4xNDEuNSIsInVwZGF0ZWRJblZlciI6IjQzLjE0MS41IiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJkZXBlbmRlbmNpZXMiLCJyZW5vdmF0ZSJdfQ==--> See merge request swiss-armed-forces/cyber-command/cea/loom!486 Co-authored-by: Loom MR Pipeline Trigger <group_103951964_bot_9504bb8dead6d4e406ad817a607f24be@noreply.gitlab.com>
Uh oh!
There was an error while loading. Please reload this page.